Overview

Dataset statistics

Number of variables: 12
Number of observations: 5693
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 533.8 KiB
Average record size in memory: 96.0 B

Variable types

Numeric: 12

Alerts

df_index is highly correlated with customer_id and 2 other fields (High correlation)
gross_revenue is highly correlated with purchases_quantity and 4 other fields (High correlation)
recency_days is highly correlated with df_index and 2 other fields (High correlation)
purchases_quantity is highly correlated with gross_revenue and 2 other fields (High correlation)
basket_size is highly correlated with gross_revenue and 4 other fields (High correlation)
qt_products is highly correlated with gross_revenue and 2 other fields (High correlation)
max_recency is highly correlated with df_index and 2 other fields (High correlation)
qt_returns is highly correlated with gross_revenue and 2 other fields (High correlation)
purchased_returned_diff is highly correlated with gross_revenue and 3 other fields (High correlation)
frequency is highly correlated with purchases_quantity (High correlation)
avg_ticket is highly correlated with gross_revenue and 2 other fields (High correlation)
customer_id is highly correlated with df_index and 2 other fields (High correlation)
gross_revenue is highly skewed (γ1 = 21.65111285) (Skewed)
basket_size is highly skewed (γ1 = 23.05322796) (Skewed)
avg_ticket is highly skewed (γ1 = 53.23956349) (Skewed)
qt_returns is highly skewed (γ1 = 51.51976197) (Skewed)
df_index is uniformly distributed (Uniform)
df_index has unique values (Unique)
customer_id has unique values (Unique)
qt_returns has 4190 (73.6%) zeros (Zeros)
purchased_returned_diff has 115 (2.0%) zeros (Zeros)
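
The Skewed alerts quote the sample skewness γ1 of each column. As a rough illustration of what that number measures, the sketch below computes the moment-based skewness and its small-sample adjusted form in plain Python. The function name `skewness` and the toy data are hypothetical, and which of the two forms the report uses is an assumption (pandas' `skew()` is understood to return the adjusted one). The alerts in this report are consistent with a threshold of about 20, since qt_products (17.7) and purchases_quantity (13.2) are not flagged, but the threshold itself is not stated in the report.

```python
import math


def skewness(values, adjusted=True):
    """Sample skewness; adjusted=True applies the small-sample correction."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mean = sum(values) / n
    # Second and third central moments (population form, divided by n).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        return 0.0  # constant data has no asymmetry
    g1 = m3 / m2 ** 1.5
    if not adjusted:
        return g1
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)


symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
long_tail = [1.0, 1.0, 2.0, 2.0, 3.0, 250.0]  # one extreme value, like a large order

print(skewness(symmetric))   # symmetric data: skewness of zero
print(skewness(long_tail))   # a single large value pulls the skewness positive
```

A few extreme customers are enough to produce very large γ1 values, which matches the gap between the 95-th percentile and the maximum in the flagged columns.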

Reproduction

Analysis started: 2022-11-18 03:03:20.317031
Analysis finished: 2022-11-18 03:03:49.840412
Duration: 29.52 seconds
Software version: pandas-profiling v3.4.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 5693
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2897.769542
Minimum: 0
Maximum: 5786
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 292.6
Q1: 1457
Median: 2900
Q3: 4342
95-th percentile: 5495.4
Maximum: 5786
Range: 5786
Interquartile range (IQR): 2885

Descriptive statistics

Standard deviation: 1668.431283
Coefficient of variation (CV): 0.5757639656
Kurtosis: -1.195986729
Mean: 2897.769542
Median Absolute Deviation (MAD): 1443
Skewness: -0.003756457883
Sum: 16497002
Variance: 2783662.945
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
0                    1      < 0.1%
3891                 1      < 0.1%
3867                 1      < 0.1%
3866                 1      < 0.1%
3865                 1      < 0.1%
3864                 1      < 0.1%
3863                 1      < 0.1%
3862                 1      < 0.1%
3861                 1      < 0.1%
3860                 1      < 0.1%
Other values (5683)  5683   99.8%

Smallest 10 values
Value  Count  Frequency (%)
0      1      < 0.1%
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%

Largest 10 values
Value  Count  Frequency (%)
5786   1      < 0.1%
5785   1      < 0.1%
5784   1      < 0.1%
5783   1      < 0.1%
5782   1      < 0.1%
5781   1      < 0.1%
5780   1      < 0.1%
5779   1      < 0.1%
5778   1      < 0.1%
5777   1      < 0.1%

customer_id
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 5693
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16429.71158
Minimum: 12346
Maximum: 21997
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 12346
5-th percentile: 12698.8
Q1: 14288
Median: 16227
Q3: 18211
95-th percentile: 21020.8
Maximum: 21997
Range: 9651
Interquartile range (IQR): 3923

Descriptive statistics

Standard deviation: 2563.802211
Coefficient of variation (CV): 0.1560466962
Kurtosis: -0.8649309179
Mean: 16429.71158
Median Absolute Deviation (MAD): 1963
Skewness: 0.3130952856
Sum: 93534348
Variance: 6573081.777
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
17850                1      < 0.1%
20399                1      < 0.1%
16498                1      < 0.1%
13745                1      < 0.1%
15584                1      < 0.1%
20377                1      < 0.1%
20376                1      < 0.1%
20375                1      < 0.1%
20374                1      < 0.1%
15578                1      < 0.1%
Other values (5683)  5683   99.8%

Smallest 10 values
Value  Count  Frequency (%)
12346  1      < 0.1%
12347  1      < 0.1%
12348  1      < 0.1%
12349  1      < 0.1%
12350  1      < 0.1%
12352  1      < 0.1%
12353  1      < 0.1%
12354  1      < 0.1%
12355  1      < 0.1%
12356  1      < 0.1%

Largest 10 values
Value  Count  Frequency (%)
21997  1      < 0.1%
21996  1      < 0.1%
21995  1      < 0.1%
21994  1      < 0.1%
21993  1      < 0.1%
21992  1      < 0.1%
21988  1      < 0.1%
21987  1      < 0.1%
21984  1      < 0.1%
21983  1      < 0.1%

gross_revenue
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 5449
Distinct (%): 95.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1803.12215
Minimum: 0.42
Maximum: 279138.02
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 0.42
5-th percentile: 13.128
Q1: 236.18
Median: 612.78
Q3: 1571.19
95-th percentile: 5326.398
Maximum: 279138.02
Range: 279137.6
Interquartile range (IQR): 1335.01

Descriptive statistics

Standard deviation: 7884.351281
Coefficient of variation (CV): 4.372610741
Kurtosis: 610.3336303
Mean: 1803.12215
Median Absolute Deviation (MAD): 478.74
Skewness: 21.65111285
Sum: 10265174.4
Variance: 62162995.13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
7.95                 9      0.2%
2.95                 8      0.1%
4.95                 8      0.1%
1.25                 8      0.1%
12.75                7      0.1%
1.65                 7      0.1%
3.75                 7      0.1%
5.95                 6      0.1%
4.25                 6      0.1%
7.5                  6      0.1%
Other values (5439)  5621   98.7%

Smallest 10 values
Value  Count  Frequency (%)
0.42   1      < 0.1%
0.65   1      < 0.1%
0.79   1      < 0.1%
0.84   4      0.1%
0.85   3      0.1%
1.07   1      < 0.1%
1.25   8      0.1%
1.44   1      < 0.1%
1.65   7      0.1%
1.69   1      < 0.1%

Largest 10 values
Value      Count  Frequency (%)
279138.02  1      < 0.1%
259657.3   1      < 0.1%
194550.79  1      < 0.1%
168472.5   1      < 0.1%
136275.72  1      < 0.1%
124564.53  1      < 0.1%
116729.63  1      < 0.1%
91062.38   1      < 0.1%
77183.6    1      < 0.1%
72882.09   1      < 0.1%

recency_days
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 304
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 116.8115229
Minimum: 0
Maximum: 373
Zeros: 38
Zeros (%): 0.7%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 22
Median: 71
Q3: 199
95-th percentile: 338
Maximum: 373
Range: 373
Interquartile range (IQR): 177

Descriptive statistics

Standard deviation: 111.591303
Coefficient of variation (CV): 0.9553107448
Kurtosis: -0.6395565525
Mean: 116.8115229
Median Absolute Deviation (MAD): 61
Skewness: 0.8157757981
Sum: 665008
Variance: 12452.6189
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count  Frequency (%)
1                   110    1.9%
4                   105    1.8%
3                   98     1.7%
2                   92     1.6%
10                  86     1.5%
8                   82     1.4%
17                  79     1.4%
9                   79     1.4%
7                   78     1.4%
15                  66     1.2%
Other values (294)  4818   84.6%

Smallest 10 values
Value  Count  Frequency (%)
0      38     0.7%
1      110    1.9%
2      92     1.6%
3      98     1.7%
4      105    1.8%
5      52     0.9%
7      78     1.4%
8      82     1.4%
9      79     1.4%
10     86     1.5%

Largest 10 values
Value  Count  Frequency (%)
373    23     0.4%
372    22     0.4%
371    17     0.3%
369    4      0.1%
368    13     0.2%
367    16     0.3%
366    15     0.3%
365    19     0.3%
364    11     0.2%
362    7      0.1%

purchases_quantity
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 57
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.471631829
Minimum: 1
Maximum: 206
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 4
95-th percentile: 11
Maximum: 206
Range: 205
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 6.807831418
Coefficient of variation (CV): 1.960988882
Kurtosis: 300.7030961
Mean: 3.471631829
Median Absolute Deviation (MAD): 0
Skewness: 13.16111093
Sum: 19764
Variance: 46.34656861
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value              Count  Frequency (%)
1                  2869   50.4%
2                  825    14.5%
3                  503    8.8%
4                  394    6.9%
5                  237    4.2%
6                  173    3.0%
7                  138    2.4%
8                  98     1.7%
9                  70     1.2%
11                 54     0.9%
Other values (47)  332    5.8%

Smallest 10 values
Value  Count  Frequency (%)
1      2869   50.4%
2      825    14.5%
3      503    8.8%
4      394    6.9%
5      237    4.2%
6      173    3.0%
7      138    2.4%
8      98     1.7%
9      70     1.2%
10     54     0.9%

Largest 10 values
Value  Count  Frequency (%)
206    1      < 0.1%
198    1      < 0.1%
124    1      < 0.1%
97     1      < 0.1%
91     2      < 0.1%
86     1      < 0.1%
72     1      < 0.1%
62     2      < 0.1%
60     1      < 0.1%
57     1      < 0.1%

basket_size
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 1843
Distinct (%): 32.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 978.8603548
Minimum: 1
Maximum: 196844
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 106
Median: 317
Q3: 806
95-th percentile: 2943.4
Maximum: 196844
Range: 196843
Interquartile range (IQR): 700

Descriptive statistics

Standard deviation: 4429.497205
Coefficient of variation (CV): 4.525157427
Kurtosis: 785.219949
Mean: 978.8603548
Median Absolute Deviation (MAD): 253
Skewness: 23.05322796
Sum: 5572652
Variance: 19620445.49
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
1                    114    2.0%
2                    72     1.3%
3                    51     0.9%
4                    49     0.9%
5                    35     0.6%
6                    29     0.5%
12                   25     0.4%
88                   22     0.4%
72                   21     0.4%
7                    20     0.4%
Other values (1833)  5255   92.3%

Smallest 10 values
Value  Count  Frequency (%)
1      114    2.0%
2      72     1.3%
3      51     0.9%
4      49     0.9%
5      35     0.6%
6      29     0.5%
7      20     0.4%
8      18     0.3%
9      7      0.1%
10     17     0.3%

Largest 10 values
Value   Count  Frequency (%)
196844  1      < 0.1%
80997   1      < 0.1%
80179   1      < 0.1%
77373   1      < 0.1%
74215   1      < 0.1%
69993   1      < 0.1%
64549   1      < 0.1%
64124   1      < 0.1%
63312   1      < 0.1%
58343   1      < 0.1%

qt_products
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 529
Distinct (%): 9.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 92.61233093
Minimum: 1
Maximum: 7838
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 14
Median: 41
Q3: 106
95-th percentile: 332.4
Maximum: 7838
Range: 7837
Interquartile range (IQR): 92

Descriptive statistics

Standard deviation: 210.2090138
Coefficient of variation (CV): 2.269773492
Kurtosis: 508.9425687
Mean: 92.61233093
Median Absolute Deviation (MAD): 33
Skewness: 17.7059352
Sum: 527242
Variance: 44187.82948
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count  Frequency (%)
1                   256    4.5%
2                   148    2.6%
3                   110    1.9%
10                  101    1.8%
6                   99     1.7%
9                   92     1.6%
5                   90     1.6%
4                   86     1.5%
7                   84     1.5%
11                  83     1.5%
Other values (519)  4544   79.8%

Smallest 10 values
Value  Count  Frequency (%)
1      256    4.5%
2      148    2.6%
3      110    1.9%
4      86     1.5%
5      90     1.6%
6      99     1.7%
7      84     1.5%
8      80     1.4%
9      92     1.6%
10     101    1.8%

Largest 10 values
Value  Count  Frequency (%)
7838   1      < 0.1%
5589   1      < 0.1%
5095   1      < 0.1%
4580   1      < 0.1%
2698   1      < 0.1%
2379   1      < 0.1%
2060   1      < 0.1%
1818   1      < 0.1%
1673   1      < 0.1%
1637   1      < 0.1%

avg_ticket
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 5501
Distinct (%): 96.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54.6078878
Minimum: 0.42
Maximum: 77183.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 0.42
5-th percentile: 3.459660952
Q1: 7.95
Median: 15.84941176
Q3: 21.96956522
95-th percentile: 76.32
Maximum: 77183.6
Range: 77183.18
Interquartile range (IQR): 14.01956522

Descriptive statistics

Standard deviation: 1281.784009
Coefficient of variation (CV): 23.47250664
Kurtosis: 2950.232891
Mean: 54.6078878
Median Absolute Deviation (MAD): 7.485911765
Skewness: 53.23956349
Sum: 310882.7053
Variance: 1642970.246
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
3.75                 11     0.2%
4.95                 10     0.2%
1.25                 9      0.2%
2.95                 9      0.2%
7.95                 8      0.1%
1.65                 7      0.1%
8.25                 7      0.1%
12.75                7      0.1%
3.35                 6      0.1%
4.15                 6      0.1%
Other values (5491)  5613   98.6%

Smallest 10 values
Value         Count  Frequency (%)
0.42          3      0.1%
0.535         1      < 0.1%
0.65          1      < 0.1%
0.79          1      < 0.1%
0.8371428571  1      < 0.1%
0.84          2      < 0.1%
0.85          3      0.1%
1.002222222   1      < 0.1%
1.02          1      < 0.1%
1.03875       1      < 0.1%

Largest 10 values
Value        Count  Frequency (%)
77183.6      1      < 0.1%
56157.5      1      < 0.1%
13305.5      1      < 0.1%
4453.43      1      < 0.1%
3861         1      < 0.1%
3202.92      1      < 0.1%
3096         1      < 0.1%
1687.2       1      < 0.1%
1377.077778  1      < 0.1%
1001.2       1      < 0.1%

max_recency
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 365
Distinct (%): 6.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 152.4753206
Minimum: 0
Maximum: 373
Zeros: 4
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 21
Q1: 67
Median: 135
Q3: 225
95-th percentile: 351
Maximum: 373
Range: 373
Interquartile range (IQR): 158

Descriptive statistics

Standard deviation: 100.2344246
Coefficient of variation (CV): 0.657381301
Kurtosis: -0.7722855338
Mean: 152.4753206
Median Absolute Deviation (MAD): 75
Skewness: 0.5157041008
Sum: 868042
Variance: 10046.93988
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count  Frequency (%)
154                 44     0.8%
63                  44     0.8%
213                 40     0.7%
42                  37     0.6%
53                  35     0.6%
64                  35     0.6%
101                 35     0.6%
119                 34     0.6%
50                  34     0.6%
56                  34     0.6%
Other values (355)  5321   93.5%

Smallest 10 values
Value  Count  Frequency (%)
0      4      0.1%
1      11     0.2%
2      7      0.1%
3      13     0.2%
4      18     0.3%
5      10     0.2%
7      15     0.3%
8      7      0.1%
9      16     0.3%
10     23     0.4%

Largest 10 values
Value  Count  Frequency (%)
373    23     0.4%
372    22     0.4%
371    17     0.3%
369    4      0.1%
368    13     0.2%
367    16     0.3%
366    16     0.3%
365    20     0.4%
364    12     0.2%
363    1      < 0.1%

qt_returns
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 214
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 46.91779378
Minimum: 0
Maximum: 80995
Zeros: 4190
Zeros (%): 73.6%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 38
Maximum: 80995
Range: 80995
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1474.979277
Coefficient of variation (CV): 31.43752419
Kurtosis: 2717.510193
Mean: 46.91779378
Median Absolute Deviation (MAD): 0
Skewness: 51.51976197
Sum: 267103
Variance: 2175563.868
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count  Frequency (%)
0                   4190   73.6%
1                   169    3.0%
2                   150    2.6%
3                   105    1.8%
4                   89     1.6%
6                   78     1.4%
5                   61     1.1%
12                  52     0.9%
7                   44     0.8%
8                   43     0.8%
Other values (204)  712    12.5%

Smallest 10 values
Value  Count  Frequency (%)
0      4190   73.6%
1      169    3.0%
2      150    2.6%
3      105    1.8%
4      89     1.6%
5      61     1.1%
6      78     1.4%
7      44     0.8%
8      43     0.8%
9      41     0.7%

Largest 10 values
Value  Count  Frequency (%)
80995  1      < 0.1%
74215  1      < 0.1%
9360   1      < 0.1%
9014   1      < 0.1%
8004   1      < 0.1%
4427   1      < 0.1%
3768   1      < 0.1%
3331   1      < 0.1%
2878   1      < 0.1%
2022   1      < 0.1%

purchased_returned_diff
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 1852
Distinct (%): 32.5%
Missing: 0
Missing (%): 0.0%
Infinite: 12
Infinite (%): 0.2%
Mean: -inf
Minimum: -inf
Maximum: 12.18870266
Zeros: 115
Zeros (%): 2.0%
Negative: 12
Negative (%): 0.2%
Memory size: 44.6 KiB

Quantile statistics

Minimum: -inf
5-th percentile: 1.386294361
Q1: 4.644390899
Median: 5.743003188
Q3: 6.678342115
95-th percentile: 7.958435846
Maximum: 12.18870266
Range: inf
Interquartile range (IQR): 2.033951216

Descriptive statistics

Standard deviation: nan
Coefficient of variation (CV): nan
Kurtosis: nan
Mean: -inf
Median Absolute Deviation (MAD): 1.008098281
Skewness: nan
Sum: -inf
Variance: nan
Monotonicity: Not monotonic
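
The -inf and nan entries above have a likely explanation. The finite values in this column look like natural logarithms (0.6931471806 = ln 2, 1.098612289 = ln 3), and in the sample rows the column matches ln(basket_size - qt_returns), so the 12 infinite entries are plausibly logs of a zero difference. That reading is an inference from the numbers, not something the report states. The sketch below, with the hypothetical helpers `safe_log` and `mean_and_variance`, shows how a single -inf drives the mean to -inf and the variance to nan, while rank-based statistics such as the median stay finite.

```python
import math


def safe_log(value):
    """Natural log that maps 0 to -inf (as NumPy does) instead of raising."""
    return math.log(value) if value > 0 else float("-inf")


def mean_and_variance(values):
    """Plain sample mean and (n - 1) variance, with no special-casing of inf."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return mean, variance


# Hypothetical purchased-minus-returned differences; the last one is zero.
diffs = [1693, 1355, 4978, 0]
logged = [safe_log(d) for d in diffs]

mean, variance = mean_and_variance(logged)
print(mean)       # -inf: any sum containing -inf is -inf
print(variance)   # nan: (-inf) - (-inf) is undefined

# Excluding the infinite entries restores finite moments.
finite = [x for x in logged if math.isfinite(x)]
finite_mean, finite_variance = mean_and_variance(finite)
print(finite_mean, finite_variance)
```

If the moments of this column are needed, one option is to filter or clip the zero differences before taking the log; which treatment is appropriate depends on the analysis and is not decided here.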
Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
0                    115    2.0%
0.6931471806         71     1.2%
1.098612289          51     0.9%
1.386294361          49     0.9%
1.609437912          35     0.6%
1.791759469          29     0.5%
2.48490665           24     0.4%
4.477336814          22     0.4%
4.276666119          22     0.4%
1.945910149          20     0.4%
Other values (1842)  5255   92.3%

Smallest 10 values
Value         Count  Frequency (%)
-inf          12     0.2%
0             115    2.0%
0.6931471806  71     1.2%
1.098612289   51     0.9%
1.386294361   49     0.9%
1.609437912   35     0.6%
1.791759469   29     0.5%
1.945910149   20     0.4%
2.079441542   18     0.3%
2.197224577   7      0.1%

Largest 10 values
Value        Count  Frequency (%)
12.18870266  1      < 0.1%
11.25085916  1      < 0.1%
11.24958472  1      < 0.1%
11.14245581  1      < 0.1%
11.06857399  1      < 0.1%
11.0511122   1      < 0.1%
11.03178808  1      < 0.1%
10.96856029  1      < 0.1%
10.95103459  1      < 0.1%
10.8075235   1      < 0.1%
< 0.1%

frequency
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1225
Distinct (%): 21.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5471335777
Minimum: 0.005449591281
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 0.005449591281
5-th percentile: 0.01102941176
Q1: 0.02492211838
Median: 1
Q3: 1
95-th percentile: 1
Maximum: 17
Range: 16.99455041
Interquartile range (IQR): 0.9750778816

Descriptive statistics

Standard deviation: 0.550291935
Coefficient of variation (CV): 1.005772552
Kurtosis: 139.1568734
Mean: 0.5471335777
Median Absolute Deviation (MAD): 0
Skewness: 4.860114251
Sum: 3114.831458
Variance: 0.3028212137
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
1                    2877   50.5%
2                    47     0.8%
0.0625               18     0.3%
0.02777777778        17     0.3%
0.02380952381        16     0.3%
0.08333333333        15     0.3%
0.09090909091        15     0.3%
0.02941176471        14     0.2%
0.03448275862        14     0.2%
0.02127659574        13     0.2%
Other values (1215)  2647   46.5%

Smallest 10 values
Value           Count  Frequency (%)
0.005449591281  1      < 0.1%
0.005464480874  1      < 0.1%
0.005479452055  1      < 0.1%
0.005494505495  1      < 0.1%
0.005586592179  2      < 0.1%
0.005602240896  1      < 0.1%
0.005617977528  2      < 0.1%
0.00566572238   1      < 0.1%
0.005681818182  2      < 0.1%
0.005698005698  3      0.1%

Largest 10 values
Value         Count  Frequency (%)
17            1      < 0.1%
4             1      < 0.1%
3             5      0.1%
2             47     0.8%
1.142857143   1      < 0.1%
1             2877   50.5%
0.75          1      < 0.1%
0.6666666667  3      0.1%
0.550802139   1      < 0.1%
0.5308310992  1      < 0.1%

Interactions

[Figures: pairwise interaction plots for the 12 variables (144 panels)]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
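
As a minimal sketch of that calculation, the plain-Python example below ranks each variable (ties share the average rank) and then applies the Pearson formula to the ranks. The helper names `average_ranks`, `pearson`, and `spearman` and the toy data are illustrative assumptions, not part of the report or of any library.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance over the product of standard deviations (1/n factors cancel)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x

print(spearman(x, y))  # ranks agree perfectly
print(pearson(x, y))   # below 1 because the relation is not linear
```

The cubic example illustrates the point made above: a nonlinear but monotonic relationship is fully captured by ρ and only partly by r.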

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
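
The formula and the invariance property can be checked with a short plain-Python sketch. The function name `pearson_r` and the toy data are assumptions made for illustration; note that the invariance holds for positive scale factors, while a negative factor flips the sign of r.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)

# Shift and rescale each variable separately: r is unchanged.
r_rescaled = pearson_r([2.0 * a + 3.0 for a in x], [0.5 * b - 1.0 for b in y])

# A negative scale factor on one variable flips the sign.
r_flipped = pearson_r([-a for a in x], y)

print(r, r_rescaled, r_flipped)
```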

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
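
The pair-counting definition above can be sketched directly in plain Python. This is the simple form often called tau-a; the function name `kendall_tau_a` and the data are illustrative assumptions. Statistical libraries commonly report a tie-adjusted variant (tau-b) instead, so values may differ from this sketch when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs; tied pairs count as neither."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair out of six

tau = kendall_tau_a(x, y)
print(tau)  # 5 concordant, 1 discordant, 6 pairs in total
```

The loop is quadratic in the number of observations, which is fine for illustration but slow for large columns.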

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

row   df_index  customer_id  gross_revenue  recency_days  purchases_quantity  basket_size  qt_products  avg_ticket  max_recency  qt_returns  purchased_returned_diff  frequency
0     0         17850        5391.21000     372.00000     34.00000            1733.00000   297.00000    18.15222    372.00000    40.00000    7.43426                  17.00000
1     1         13047        3232.59000     56.00000      9.00000             1390.00000   171.00000    18.90404    71.00000     35.00000    7.21156                  0.02830
2     2         12583        6705.38000     2.00000       15.00000            5028.00000   232.00000    28.90250    73.00000     50.00000    8.51278                  0.04032
3     3         13748        948.25000      95.00000      5.00000             439.00000    28.00000     33.86607    137.00000    0.00000     6.08450                  0.01792
4     4         15100        876.00000      333.00000     3.00000             80.00000     3.00000      292.00000   333.00000    22.00000    4.06044                  0.07317
5     5         15291        4623.30000     25.00000      14.00000            2102.00000   102.00000    45.32647    78.00000     29.00000    7.63675                  0.04011
6     6         14688        5630.87000     7.00000       21.00000            3621.00000   327.00000    17.21979    48.00000     399.00000   8.07776                  0.05722
7     7         17809        5411.91000     16.00000      12.00000            2057.00000   61.00000     88.71984    70.00000     41.00000    7.60887                  0.03352
8     8         15311        60767.90000    0.00000       91.00000            38194.00000  2379.00000   25.54346    21.00000     474.00000   10.53795                 0.24332
9     9         16098        2005.63000     87.00000      7.00000             613.00000    67.00000     29.93478    87.00000     0.00000     6.41836                  0.02439

Last rows

row   df_index  customer_id  gross_revenue  recency_days  purchases_quantity  basket_size  qt_products  avg_ticket  max_recency  qt_returns  purchased_returned_diff  frequency
5683  5777      21988        4839.42000     1.00000       1.00000             1074.00000   62.00000     78.05516    1.00000      0.00000     6.97915                  1.00000
5684  5778      13298        360.00000      1.00000       1.00000             96.00000     2.00000      180.00000   1.00000      0.00000     4.56435                  1.00000
5685  5779      14569        227.39000      1.00000       1.00000             79.00000     12.00000     18.94917    1.00000      0.00000     4.36945                  1.00000
5686  5780      21992        17.90000       1.00000       1.00000             14.00000     7.00000      2.55714     1.00000      0.00000     2.63906                  1.00000
5687  5781      21993        3.35000        1.00000       1.00000             2.00000      2.00000      1.67500     1.00000      0.00000     0.69315                  1.00000
5688  5782      21994        5699.00000     1.00000       1.00000             1747.00000   634.00000    8.98896     1.00000      0.00000     7.46566                  1.00000
5689  5783      21995        6756.06000     0.00000       1.00000             2010.00000   730.00000    9.25488     0.00000      0.00000     7.60589                  1.00000
5690  5784      21996        3217.20000     0.00000       1.00000             654.00000    59.00000     54.52881    0.00000      0.00000     6.48311                  1.00000
5691  5785      21997        3950.72000     0.00000       1.00000             731.00000    217.00000    18.20608    0.00000      0.00000     6.59441                  1.00000
5692  5786      12713        794.55000      0.00000       1.00000             505.00000    37.00000     21.47432    0.00000      0.00000     6.22456                  1.00000